ABSTRACT



The pathogenic bacterium Streptococcus pyogenes can cause an array of diseases in humans, including
moderate infections such as pharyngitis (strep throat) as well as life-threatening conditions such as
necrotizing fasciitis and puerperal fever. The antigen I/II family proteins are cell wall anchored
adhesin proteins found on the surfaces of most oral streptococci and are involved in host colonization
and biofilm formation. In the present study we have determined the crystal structure of the
C2-3 domain of the antigen I/II type protein AspA from S. pyogenes M type 28. The structure was
solved to 1.8 Å resolution and shows that the C2-3 domain is composed of two structurally similar
DEv-IgG motifs, designated C2 and C3, both containing a stabilizing covalent isopeptide bond.
Furthermore, a metal binding site containing a bound calcium ion is identified. Despite relatively
low sequence identity, the overall structure shares high similarity with the C2-3 domains
of antigen I/II proteins from Streptococcus gordonii and Streptococcus mutans, although certain parts
of the structure exhibit distinct features. In summary, this work constitutes the first step in the full
structure determination of the AspA protein from S. pyogenes.

© 2014 The Authors. Published by Elsevier B.V. on behalf of the Federation of European Biochemical Societies. This
is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
1. Introduction

The Gram-positive bacterium Streptococcus pyogenes (or Group A
Streptococcus; GAS) is a causative agent of many human diseases,
including pharyngitis (strep throat), impetigo, scarlet fever, cellulitis,
and also potentially life-threatening invasive disease such as
necrotizing fasciitis. In order to colonize and proliferate on host tissues,
S. pyogenes must first adhere to the host cell surface, where it
has the potential to form biofilms in the oral cavity, nasopharynx
and on skin and wounds. Several mechanisms for biofilm formation
by different strains and serotypes of S. pyogenes have been proposed.
The first involves the formation of complexes between cell
surface lipoteichoic acid (LTA) and members of the M protein family,
which together control the hydrophobic properties of the cell
surface, thereby mediating adhesion and biofilm formation in M1
serotype S. pyogenes [1]. A second mechanism involves the use of
pili structures on the bacterial cell surface [2,3]. A third alternative
is that biofilm formation may be mediated by so-called antigen I/II
(AgI/II) type proteins. The AgI/II proteins are a family of cell surface
adhesins which have previously been shown to play many roles in
host colonization in the oral viridans group of streptococci. The proteins
of the AgI/II family are large, comprising 1310-1653 amino
acids, and have a distinct domain organization [4]. The primary
sequences of AgI/II proteins begin with a signal peptide, approximately
30-40 amino acid residues in length, followed in sequence
by a small N-terminal domain, an alanine rich repeat region (A), a
variable domain (V), a proline rich repeat region (P), a C-terminal region,
and finally a cell wall anchor segment containing an LPxTG motif.
The LPxTG sequence is recognized by housekeeping sortases,
which mediate cell wall anchoring and presentation of the AgI/II
protein on the bacterial surface. The crystal structures of several domains
of the AgI/II proteins from the oral streptococci Streptococcus
mutans and Streptococcus gordonii revealed that the AgI/II proteins
fold into highly elongated structures [5-9], with the V-domain
being presented at the tip of the protein, furthest away from the cell
wall [4]. The elongated structure is achieved by the interaction of
the A-domain α-helical repeat with the P-domain polyproline II
(PPII) helix, which intertwine, creating a supercoiled fibrillar stalk
structure that projects the V-domain >50 nm out from the cell surface
[4,6]. The N- and C-domains are closest to the
cell wall and form the base of the protein.

In the genome of S. pyogenes serotype M28, a single AgI/II type
protein is encoded by the aspA gene. The gene product, AspA,
comprises 1352 amino acids and is expected to exhibit the
typical AgI/II protein domain architecture (Fig. 1A). As observed
for SpaP from S. mutans and SspA and SspB from S. gordonii,
the AspA protein contains three repeat regions in the A-domain and
three repeat regions in the P-domain, and the C-domain can
be divided into three subdomains. Although the general structural
characteristics are predicted to be similar, the overall sequence
identity between AspA and SpaP is only 27% (and 28% between
AspA and SspA/B). A sequence alignment of the C2-3 regions of
AspA, SpaP and SspB is shown in Fig. 1B. In particular the AspA
V-domain, which shares only 8% sequence identity with the
V-domain of SpaP, appears to be functionally distinct from those
of SpaP, SspA and SspB. Interestingly, the AspA V-domain shares high
sequence similarity with the V-domain of the AgI/II protein BspD
from Streptococcus agalactiae, and they both contain conserved
histidine-aspartic acid clusters, typically associated with divalent
metal binding sites [4]. In agreement with this observation it was
recently shown that the V-domain of AspA can indeed bind divalent
metal ions, specifically Zn²⁺ ions [10]. However, the functional
significance of the Zn²⁺ binding remains unknown. In the same
study it was also concluded that AspA can bind immobilized
salivary agglutinin gp-340 and that deletion of the aspA gene
abolished the ability of two different M28 strains of S. pyogenes
to form biofilms on saliva coated surfaces [10].
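The sequence identities quoted above are derived from pairwise alignments. As a minimal sketch of how such a figure is obtained, the Python snippet below counts identical residues over the ungapped columns of two pre-aligned sequences. The choice of denominator (ungapped columns only) is an assumption, since other conventions exist, and the function name is hypothetical. The example strings are the final, gap-free block of the alignment in Fig. 1B as transcribed here, so the result applies to that 43-residue block only and not to the full-length proteins.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Last alignment block of Fig. 1B (AspA residues 1265-1307 and the aligned SpaP residues).
ASPA_BLOCK = "SSEFGADAFVVVKRIKAGDVANEYTLYVNGNPVKSNKVTTHTP"
SPAP_BLOCK = "DSAFQAESYIQMKRIAVGTFENTYINTVNGVTYSSNTVKTTTP"

if __name__ == "__main__":
    value = percent_identity(ASPA_BLOCK, SPAP_BLOCK)
    print(f"{value:.1f}% identity over {len(ASPA_BLOCK)} aligned columns")
```

Local blocks such as this one can be considerably more or less conserved than the whole protein, which is why the per-domain and full-length identities reported in the text differ.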



In order to increase our understanding of the mechanisms and
function of the AspA protein in S. pyogenes host colonization and
biofilm formation we have undertaken to determine the crystal
structure of the AspA polypeptide. In this study we present the
structure of the C2 and C3 components of the C-domain, determined
by X-ray crystallography to 1.8 Å resolution.

2. Materials and methods 

2.1. Cloning

The C-terminal domain of AspA was cloned from plasmid
pET-46 Ek/LIC-NAVPC [10] encoding the S. pyogenes aspA gene
(GeneID: 3574034). PCR primers were designed based on the crystal
structure of the S. gordonii SspB C2-3 domain. The forward primer
was 5'-TTTTTCCATGGATAATCTGATTCAGCCAACT-3' and the reverse primer
5'-AAAAAGGTACCTTACGTATGAGTGGTGACTTT-3'. The PCR product
was digested with Acc65I and NcoI and ligated into the equivalent
sites of the pET-M11 expression vector. The final construct encodes
MKHHHHHHPM-AspA-C2-3. The plasmids were transformed into
Escherichia coli DH5α and subsequently selected on kanamycin
plates. The positive clones were verified by DNA sequencing.
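The primer design described above can be checked with a short script. The sketch below locates the NcoI (CCATGG) and Acc65I (GGTACC) recognition sequences in the two primers and computes the reverse complement of the reverse primer, which exposes the stop codon on the coding strand. The primer strings are taken from the text, assuming that the space inside the reverse primer is an extraction artifact. The helper names are hypothetical.

```python
FORWARD_PRIMER = "TTTTTCCATGGATAATCTGATTCAGCCAACT"
REVERSE_PRIMER = "AAAAAGGTACCTTACGTATGAGTGGTGACTTT"  # internal space removed (assumption)

# Recognition sequences of the two restriction enzymes used for cloning.
RECOGNITION_SITES = {"NcoI": "CCATGG", "Acc65I": "GGTACC"}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(dna))


def find_sites(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to its 0-based start index."""
    found = {}
    for enzyme, site in RECOGNITION_SITES.items():
        position = primer.find(site)
        if position != -1:
            found[enzyme] = position
    return found


if __name__ == "__main__":
    print("forward primer sites:", find_sites(FORWARD_PRIMER))
    print("reverse primer sites:", find_sites(REVERSE_PRIMER))
    print("coding-strand view of reverse primer:", reverse_complement(REVERSE_PRIMER))
```

The five-nucleotide runs at the 5' ends sit upstream of each recognition sequence, a common way of giving the enzymes room to cut close to the end of a PCR product.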
AspA (Q48S75)   971 DNLIQPTKTIVDDKGQSIDGKSVLPNSTLTYVAKQDFDQYKGMTAAKESVMKGFIYVDDY
SpaP (P23504)  1150 NNYIKPTKVNKNENGVVIDDKTVLAGSTNYYELTWDLDQYKNDRSSADTIQKGFYYVDDY
SspB (P16952)  1075 NNYIKPTKVNKNKEGLNIDGKEVLAGSTNYYELTWDLDQYKGDKSSKEAIQNGFYYVDDY

Secondary structure elements: D, D', DH1 (BAR-helix), D"
AspA (Q48S75)  1031 KDEAIDGHSLVVNSIKAANGDDVTNLLEMRHVLSQDTLDDKLKALIKASGISPVGEFYMW
SpaP (P23504)  1210 PEEALELRQDLVKI-TDANGNEVTGV-SVDNYTSLEAAPQEIRDVLSKAGIRPKGAFQIF
SspB (P16952)  1135 PEEALDVRPDLVKV-ADEKGNQVSGV-SVQQYDSLEAAPKKVQDLLKKANITVKGAFQLF

AspA (Q48S75)  1091 VAKDPAAFYKAYVQKGLDITYNLSFKLKQDF--KKGDITNQTYQIDFGNGYYGNIVVNHL
SpaP (P23504)  1268 RADNPREFYDTYVKTGIDLKIVSPMVVKKQMGQTGGSYENQAYQIDFGNGYASNIVINNV
SspB (P16952)  1193 SADNPEEFYKQYVATGTSLVITDPMTVKSEFGKTGGKYENKAYQIDFGNGYATEVVVNNV

Secondary structure elements: A, B, C
AspA (Q48S75)  1149 SELTVHKDVF DKEGGQSINAGTVKVGDEVTYRLEGWVVPTNRGYDLTEYKFVDQLQH
SpaP (P23504)  1328 PKINPKKDVTLTLDPADTNNVDGQTIPLNTVFNYRLIGGIIPANHSEELFEYNFYDDYDQ
SspB (P16952)  1253 PKITPKKDVTVSLDPTS-ENLDGQTVQLYQTFNYRLIGGLIPQNHSEELEDYSFVDDYDQ

Secondary structure elements: D, D', D", D'", D"", DH1
AspA (Q48S75)  1206 THDLYQK-DKVLATVDITLSDGSVITKGTDLAKYTETVYNKETGHYELAFKQDFLAKVVR
SpaP (P23504)  1388 TGDHYTGQYKVFAKVDIILKNGVIIKSGTELTQYTTAEVDTTKGAITIKFKEAFLRSVSI
SspB (P16952)  1312 AGDQYTGNYKTFSSLNLTMKDGSVIKAGTDLTSQTTAETDATNGIVTVRFKEDFLQKISL

AspA (Q48S75)  1265 SSEFGADAFVVVKRIKAGDVANEYTLYVNGNPVKSNKVTTHTP 1307
SpaP (P23504)  1448 DSAFQAESYIQMKRIAVGTFENTYINTVNGVTYSSNTVKTTTP 1490
SspB (P16952)  1372 DSPFQAETYLQMRRIAIGTFENTYVNTVNKVAYASNTVRTTTP 1414

Fig. 1. Domain organization of S. pyogenes AspA and sequence alignment of the C2-3 domains of AspA, SpaP and SspB. (A) The primary sequence of AspA begins with a signal
peptide (SP), followed in sequence by a small N-terminal domain (N), an alanine rich repeat region (A), a variable domain (V), a proline rich repeat region (P), a C-terminal
region (C), and finally a cell wall anchor segment (CW) containing an LPxTG motif for sortase mediated cell wall attachment. (B) Sequence alignment of the AspA (UniProt id.
Q48S75), SpaP (P23504) and SspB (P16952) C2-3 domains. Secondary structure elements from the structure of AspA-C2-3 are marked above the alignment.
2.2. Overexpression and purification

The AspA-C2-3 construct was overexpressed in E. coli BL21 (DE3)
at 310 K in Luria Broth supplemented with 50 µg mL⁻¹ kanamycin.
When the cultures reached an OD600 of 0.6, the temperature was
lowered to 303 K and expression was induced with 0.4 mM
IPTG, after which the cultures were grown for an additional 4 h.
Cells were harvested by centrifugation at 5300g and the pellets
were frozen at 193 K. Cell pellets were resuspended in 50 mM
NaH2PO4 pH 7.5, 300 mM NaCl and 10 mM imidazole supplemented
with EDTA-free protease inhibitor cocktail (Roche). The
suspension was lysed on ice by sonication and cellular debris
was removed by centrifugation at 39,000g for 60 min. The supernatant
was loaded onto a column packed with Ni-NTA agarose
(Qiagen). The protein was washed in 50 mM NaH2PO4 pH 7.5,
300 mM NaCl and 20 mM imidazole and eluted with 50 mM
NaH2PO4 pH 7.5, 300 mM NaCl and 300 mM imidazole. The buffer
was exchanged to 30 mM Tris-HCl pH 8.0, 200 mM NaCl, 1 mM
EDTA and 0.5 mM TCEP. The protein was further purified by
size-exclusion chromatography using a HiLoad™ 16/60 Superdex™
200 prep-grade column (GE Healthcare). The protein purity was
judged by SDS-PAGE to be >95% and the protein was finally concentrated
to 85 mg mL⁻¹ in 20 mM Tris-HCl pH 8.0 using an Amicon
Ultra centrifugal filter device (Millipore).

2.3. Crystallization, data collection and processing 

Initial crystallization trials were performed by the sitting-drop 
vapor-diffusion method in 96-well MRC-crystallization plates 
(Molecular Dimensions) using a Mosquito (TTP Labtech) pipetting 
robot and standard crystal screening kits (Hampton Research and 
Molecular Dimensions). Crystals were obtained in condition 44 of 
the MIDAS screen (20% glycerol ethoxylate, 10% tetrahydrofuran 
and 0.1 M Tris-HCl pH 8.0). This condition was subsequently optimized
using the hanging-drop method by mixing 1 µL protein, at a
concentration of 85 mg mL⁻¹, and 2 µL reservoir solution. Crystals
suitable for data collection were obtained after 3 weeks. Diffraction 
data from a single crystal were collected on a Bruker Proteum X8 
X-ray diffraction system, equipped with a MICROSTAR-H rotating 
anode generator and a Platinum 135 detector. The diffraction 
images were integrated with SAINT and scaled and merged using 
SADABS, components of the Proteum software (Bruker). 

2.4. Structure determination and refinement 

The structure was solved using the BALBES molecular replace- 
ment pipeline [11] and automated model building was performed 
using phenix.autobuild [12]. The model was completed by iterative 
rounds of manual model building using COOT [13] and refinement 
using phenix.refine [14]. Translation-libration-screw (TLS) 
refinement was used in the last rounds of refinement, treating each 
domain (amino acids 970-1146 for the C2 domain and 1147-1306
for the C3 domain) as separate TLS groups. Evolutionary conservation
analysis was performed using ConSurf [15] and figures were
prepared using CCP4MG [16] and PyMOL [17].

3. Results and discussion 

3.1. Crystallization and structure determination

The full length AspA protein consists of 1352 amino acid
residues, with the C-terminal domain constituting 511 residues.
A construct representing the C2 and C3 components of the
C-terminal domain (residues 971-1307) was expressed, purified
and crystallized by the hanging drop vapor diffusion method. The
crystals diffracted to a maximum resolution of 1.8 Å and belonged
to space group P2₁2₁2, with unit cell parameters a = 99.0, b = 105.0,
c = 74.5 Å. Relevant processing, refinement and model quality
statistics are presented in Table 1. The asymmetric unit contained
two AspA-C2-3 molecules (A and B). The structure was solved by
molecular replacement and automated model building
was subsequently performed. The model was completed by iterative manual
model building and refinement, finally yielding a model with an
Rwork of 18.2% and an Rfree of 23.1%. The final model is well ordered
with an average atomic displacement factor of 22.8 Å². The structure
consists of residues 971-1306 in chain A and residues 972-1306
in chain B, using numbering based on the full length AspA protein.
Additionally, 984 water molecules are included in the model.
No electron density was observed for the flexible poly-histidine
tag and linker region, and they were hence not included in the
model.
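The packing implied by these crystal parameters can be estimated with a Matthews coefficient calculation. The sketch below is a rough plausibility check and not part of the reported analysis. It assumes an orthorhombic cell (volume a·b·c), four asymmetric units per unit cell for space group P2₁2₁2, two molecules per asymmetric unit as stated in the text, and a molecular mass approximated as 110 Da per residue for the 347-residue construct (337 AspA residues plus the 10-residue tag). The mass estimate and all function names are assumptions introduced for illustration.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit cell volume in cubic angstroms; valid when all cell angles are 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume: float, asym_units_per_cell: int,
                         molecules_per_asu: int, molecular_mass_da: float) -> float:
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    return cell_volume / (asym_units_per_cell * molecules_per_asu * molecular_mass_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent content using the standard relation 1 - 1.23 / Vm."""
    return 1.0 - 1.23 / vm


# Cell edges from the text (angstroms).
CELL_A, CELL_B, CELL_C = 99.0, 105.0, 74.5
ASYM_UNITS_P21212 = 4        # general positions in space group P2(1)2(1)2
MOLECULES_PER_ASU = 2        # chains A and B
APPROX_MASS_DA = 347 * 110.0 # assumption: about 110 Da per residue

if __name__ == "__main__":
    volume = orthorhombic_cell_volume(CELL_A, CELL_B, CELL_C)
    vm = matthews_coefficient(volume, ASYM_UNITS_P21212, MOLECULES_PER_ASU, APPROX_MASS_DA)
    print(f"cell volume: {volume:.1f} A^3")
    print(f"Matthews coefficient: {vm:.2f} A^3/Da")
    print(f"estimated solvent fraction: {solvent_fraction(vm):.2f}")
```

Under these assumptions the coefficient comes out at roughly 2.5 Å³/Da, which is within the range commonly observed for protein crystals and is consistent with two molecules in the asymmetric unit.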

3.2. Overall structure 

The overall topology of the AspA-C2-3 structure is presented in
Fig. 2. The structure comprises two distinct domains, referred to as the
C2- and C3-domains (residues 971-1149 and 1150-1306, respectively).
Together these two domains form an elongated structure,
approximately 95 Å long and 35 Å wide. Each domain adopts the
DEv-IgG fold [18], which like the classical IgG folds is composed
of two major antiparallel β-sheets, designated ABED and
CFG. The ABED sheet is formed by the A, B, E and D strands while
the CFG sheet is correspondingly formed by strands C, F and G. The
main variation from the classical IgG folds, including additional
helices and strands, is found between the D and E strands. For
the C2-domain, there are two additional strands on the CFG sheet,
designated D' and D", as well as two α-helices, DH1 and DH2,
located respectively on the loop region between strands D' and


Table 1

Data collection, refinement and model statistics.

Wavelength (Å)                 1.5418
Resolution range (Å)           31.67-1.80 (1.86-1.80)
Space group                    P2₁2₁2
Unit cell (Å, °)               a = 99.0, b = 105.0, c = 74.4; α, β, γ = 90
Total reflections              779400 (26976)
Unique reflections             70996 (6553)
Multiplicity                   10.8 (4.1)
Completeness (%)               97.6 (82.7)
Mean I/σ(I)                    24.6 (3.2)
Wilson B-factor (Å²)           15.5
R-merge a (%)                  6.2 (41.1)
R-work b (%)                   18.2 (31.1)
R-free b (%)                   23.1 (35.8)
Number of atoms                6365
  Macromolecules               5379
  Ligands                      2
  Water                        984
Protein residues               672
RMS bonds (Å)                  0.007
RMS angles (°)                 1.04
Ramachandran favored (%)       99
Ramachandran outliers (%)      0
Average B-factor (Å²)          22.8
  Macromolecules (Å²)          21.0
  Metal ions (Å²)              13.1
  Solvent (Å²)                 32.6
PDB code                       4OFQ



Values in parentheses indicate statistics for the highest resolution shell. 

a $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I_i(hkl) - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_{i} I_i(hkl)$, where $I_i(hkl)$ is the intensity of
the ith observation of reflection hkl and $\langle I(hkl)\rangle$ is the average over all
observations of reflection hkl.

b $R_{\mathrm{work}} = \sum \big||F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|\big| \,/\, \sum |F_{\mathrm{obs}}|$, where $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ are the observed and
calculated structure factor amplitudes, respectively. $R_{\mathrm{free}}$ is equivalent to $R_{\mathrm{work}}$ but is
calculated using a 5% randomly selected set of reflections which is excluded from
refinement.
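The residual defined in footnote b can be written out in a few lines. The sketch below computes an R-factor from paired lists of observed and calculated amplitudes and shows one way of setting aside a random 5% of reflections as a free set. It illustrates the definitions only and is not the procedure used by the refinement software. The function names, the fixed random seed and the example amplitudes are all hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Sum of | |Fobs| - |Fcalc| | divided by the sum of |Fobs|."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Return (work_indices, free_indices) with a random free set of the given fraction."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    free = sorted(indices[:n_free])
    work = sorted(indices[n_free:])
    return work, free


if __name__ == "__main__":
    # Made-up amplitudes, for illustration only.
    observed = [100.0, 50.0, 80.0, 20.0]
    calculated = [90.0, 55.0, 84.0, 19.0]
    print(f"R = {r_factor(observed, calculated):.3f}")
    work, free = split_work_free(100)
    print(f"{len(work)} working reflections, {len(free)} free reflections")
```

Because the free reflections never enter refinement, R-free is normally somewhat higher than R-work, as is the case for the values in Table 1.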
Fig. 2. Overall topology of AspA-C2-3. (A) Ribbon diagram of AspA-C2-3 presented in stereo. The C2 domain (residues 971-1150) is depicted in blue while the C3 domain
(1151-1306) is depicted in yellow. A bound calcium ion is shown as a grey sphere. (B) Topology diagram of AspA-C2-3 where α-helices are represented as rectangles,
β-strands as arrows and loops as lines. The isopeptide bonds between K978 and N1128 and between K1155 and N1286 are marked as red lines. (C) Electrostatic surface
potential rendering of AspA-C2-3 colored from -0.5 V (red) to 0.5 V (blue). (For interpretation of the references to colour in this figure legend, the reader is referred to the web
version of this article.)



D" and between strands D" and E. In addition an α-helix (BH1) is
located in the loop region between strands B and C. The central
DEv-IgG motif of the C3-domain is formed in a similar manner from
the four stranded ABED and three stranded CFG main β-sheets.
Two additional strands (D'" and D"") extend the CFG sheet into a
five stranded sheet. As for the C2-domain, sheets ABED and CFG
are interconnected by several cross-connecting loops and one
α-helix (DH1) between strand D"" of the CFG sheet and strand E
of the ABED sheet. The C2- and C3-domains are connected by a linker
extending from strand G in the C2-domain to strand A in the
C3-domain. Additionally, in the interface region between the two
domains, the side chains of D982 and N996 in the C2-domain are
involved in hydrogen bonding with the side chains of R1264 and
N1295 in the C3-domain. Main chain hydrogen bonding can also
be observed between S992 in C2 and N1189/G1191 in C3,
further stabilizing the interaction between the domains.
Finally, the main chain atoms of the interface region exhibit low
temperature factors, in a range similar to those of the main chain
atoms of the central DEv-IgG folds, suggesting that the domains
are fixed in position and that the structure is rigid. The C2-domain
contains one bound metal ion, modeled as Ca²⁺, and both the
C2- and C3-domains are stabilized by conserved isopeptide bonds,
which connect the β-sheets of the central DEv-IgG motifs. An
electrostatic surface potential rendering shows that positively
and negatively charged residues are heterogeneously distributed
across the protein surface (Fig. 2C).

3.3. Isopeptide bonds and metal binding sites 

An isopeptide bond can be observed between K978 (on strand
A) and N1128 (strand F) in domain C2, forming a covalent link between
the S1 and S2 β-sheets of the β-sandwich motif. The interaction
is completed by hydrogen bonding between the C=O and NH
isopeptide groups and the side chain of D1028 (Fig. 3A). Similarly
K1155 (strand A) and N1286 (strand F) in domain C3 are connected
by an isopeptide bond, stabilized by the side chain of D1201. The
isopeptide bonds are surrounded by aromatic and hydrophobic
residues. This may result in an increased pKa of the aspartic acid
and a decreased pKa of the lysine residue, facilitating nucleophilic attack
on the Asn Cγ carbon by the deprotonated lysine amino group,
finally yielding a covalent bond and the release of ammonium [19].
This type of self-generated stabilizing isopeptide bond has now
been identified in a number of different pilin/adhesin proteins
from Gram-positive bacteria [5,7,19-22]. The extra structural
stabilization conferred by the isopeptide bonds is probably essential for
resistance against physical and chemical stress. Oral bacteria, for
example, are constantly exposed to strong physical shear forces
from salivary flow as well as tongue movement, and must remain
firmly attached to host surfaces in order to persist and colonize.
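The effect of a lowered lysine pKa on the availability of the nucleophile can be illustrated with the Henderson-Hasselbalch relation. The sketch below computes the fraction of an amine that is deprotonated at a given pH. The pKa values are illustrative assumptions and not measurements for AspA: 10.5 is a typical textbook value for a solvent-exposed lysine side chain, and 8.0 is a hypothetical value for a lysine buried in a hydrophobic environment.

```python
def fraction_deprotonated(pka: float, ph: float) -> float:
    """Fraction of a basic group in its neutral, deprotonated form at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    ph = 7.0
    # Both pKa values are illustrative assumptions, not data from this study.
    for label, pka in (("solvent-exposed lysine", 10.5), ("buried lysine (hypothetical)", 8.0)):
        fraction = fraction_deprotonated(pka, ph)
        print(f"{label}: pKa {pka}, deprotonated fraction at pH {ph} = {fraction:.4f}")
```

Under these assumed values the neutral, nucleophilic form of the lysine becomes a few hundred times more populated, which is the qualitative point made in the text.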

The previously determined crystal structures of the C-terminal
domains of SspB and SpaP, AgI/II type proteins from S. gordonii
and S. mutans, respectively, revealed tightly bound Ca²⁺ ions in
both the C2- and C3-domains [5,7,8]. Ca²⁺ binding can also be
observed in the AspA-C2-3 structure (Fig. 3B), but only in the
C2-domain, where a single Ca²⁺ ion is coordinated by D1029
(OD1) and Y1030 (O) from the strands C-D loop, V1084 (O) and
E1086 (O) from the loop connecting helix DH1 with strand D",
and one water molecule. The position of the Ca²⁺ ion correlates
well with that observed in the SspB-C2-3 and SpaP-C2-3 structures,
and the binding residues on the strands C-D loop are highly
conserved.
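Residue labels such as those used in this section can be cross-checked against the alignment in Fig. 1B, since the first AspA row starts at residue 971. The sketch below maps a residue number to the one-letter code in that row and compares it with the label. The helper names are hypothetical, and only residues within the first 60 positions (971-1030) are covered.

```python
ASPA_ROW_START = 971
# First AspA row of the alignment in Fig. 1B (residues 971-1030).
ASPA_ROW = "DNLIQPTKTIVDDKGQSIDGKSVLPNSTLTYVAKQDFDQYKGMTAAKESVMKGFIYVDDY"


def residue_at(number: int) -> str:
    """One-letter code of the AspA residue with the given full-length number."""
    index = number - ASPA_ROW_START
    if not 0 <= index < len(ASPA_ROW):
        raise IndexError(f"residue {number} is outside {ASPA_ROW_START}-"
                         f"{ASPA_ROW_START + len(ASPA_ROW) - 1}")
    return ASPA_ROW[index]


def label_matches(label: str) -> bool:
    """Check a label such as 'K978' against the sequence row."""
    return residue_at(int(label[1:])) == label[0]


if __name__ == "__main__":
    for label in ("K978", "D982", "S992", "N996", "D1028", "D1029", "Y1030"):
        print(label, "matches" if label_matches(label) else "does NOT match")
```

A check of this kind is useful for catching transposed digits in residue labels, which are easy to introduce when numbering follows the full-length protein.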

3.4. Comparative structural and evolutionary conservation analyses 

While S. mutans SpaP-C2-3 and S. gordonii SspB-C2-3 share 64%
sequence identity (for 332 aligned amino acids), AspA-C2-3 only
shares 36% (326 aligned amino acids) and 34% (317 aligned amino
acids) sequence identity with SpaP-C2-3 and SspB-C2-3, respectively.
Despite the low overall sequence identity the structure of
AspA-C2-3 is very similar to that of S. mutans SpaP-C2-3 (r.m.s.d.
1.84 Å) and S. gordonii SspB-C2-3 (r.m.s.d. 1.9 Å) (Fig. 4A). The main
region of difference is found in and in proximity to the α-helix
(DH1) lying perpendicular to the C2-domain central DEv-IgG motif.
In S. gordonii SspB this region has been described as a recognition
handle for the short fimbria Mfa1 from the periodontal pathogen
Porphyromonas gingivalis and is referred to as BAR (for SspB Adherence
Region) [23]. Within BAR, two structural motifs, corresponding
to the DH1 helix and the loop region following it, have been
identified as being important for P. gingivalis binding [5,24]. Interestingly,
although the structures of SpaP-C2-3 and SspB-C2-3 are
highly similar, including the secondary structure in BAR [7], P. gingivalis
does not bind to S. mutans SpaP [23]. Rather, it appears that binding
depends on the specific composition of BAR, perhaps especially
on the surface charge distribution around the α-helix. The
AspA-C2-3 BAR helix also has a variable amino acid sequence
(DDKLKALIKAS, as compared to KKVQDLLKK in SspB and
QEIRDVLSK in SpaP), which gives BAR a different surface charge
distribution pattern compared to both SspB and SpaP (Fig. 4B). In addition,
in the AspA-C2-3 structure, residues D1066 and T1067 produce a
distinct bulge in the short loop preceding the BAR helix, which protrudes
from the protein surface. The interactions in which BAR of
AspA may play a role, and how these structural features may be involved,
remain unknown and require further study. Of additional
note is that all three AgI/II type proteins for which the structure of
the C2-3 domain has been determined so far (AspA, SspB, SpaP) have
bound Ca²⁺ ions located after the short loop following the DH1 α-helix,
which are thought to stabilize the position of the helix. In order
to better understand the differences and variability among the
C-terminal domains of the AgI/II type proteins, the AspA-C2-3 structure
was subjected to evolutionary conservation analysis using
ConSurf [16,25]. For the ConSurf analysis 51 homologous sequences
were collected using a BLAST search and subsequently used for


Fig. 3. Isopeptide bonds and metal binding site. (A) The isopeptide bond between K978 on strand A and N1128 on strand F in the C2 domain is represented as a stick model in
a 2Fo-Fc map, contoured at 0.90 e/Å³. It is stabilized by hydrogen bonding with the side chain of D1028 (dashed lines). (B) In the C2 domain a metal ion, modeled as a Ca²⁺ ion,
is coordinated by D1029 and Y1030 on the strands C-D loop, V1084 and E1086 from the loop connecting helix DH1 with strand D", and one water molecule. The figure is
presented in stereo.
Fig. 4. Comparative structural and evolutionary conservation analyses of AspA-C2-3. (A) Superposition of the AspA-C2-3 structure (blue) with the structures of SspB-C2-3 from
S. gordonii (orange, PDB: 2WOY) and SpaP-C2-3 from S. mutans (green, PDB: 3OPU). The BAR (SspB Adherence Region), recognized by P. gingivalis short fimbrial protein Mfa1, is
highlighted by a black circle and is shown in more detail as a stereo image in (B) together with a sequence alignment of the BAR helix. (C) Space filling model representation of
evolutionary conservation analyses for AspA-C2-3 performed using ConSurf [25]. The level of conservation of individual amino acids is indicated from variable (turquoise, 1) to
highly conserved (maroon, 9) according to the color coding bar. Positions for which the level of conservation was assigned with low confidence are marked with yellow color.
(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)



multiple sequence alignment and evolutionary conservation analysis.
The level of conservation of each amino acid, on a scale from 1-9
(where 1 = variable and 9 = very highly conserved), was mapped
onto the structural model (Fig. 4C). In good agreement with the
observations discussed above, the conservation analysis shows that
BAR, and the BAR helix in particular, is highly variable, while the
closely located Ca²⁺ binding pocket in contrast is well conserved.
Generally the β-sheet residues with side chains pointing to the interior
of the central DEv-IgG motifs in each domain, constituting the
hydrophobic core, are well conserved, while those with surface exposed
side chains are variable. This is especially evident for one side
of the C3 domain (Fig. 4C, right).
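The difference in charge between the three BAR helix sequences quoted in this section can be made concrete with a simple residue count. The sketch below assigns +1 to Lys and Arg and -1 to Asp and Glu and sums over each helix. This is a crude formal-charge estimate at neutral pH that ignores histidine, the termini and the three-dimensional arrangement of the side chains, so it is no substitute for the electrostatic surface shown in Fig. 4B. The function name is hypothetical.

```python
# BAR helix sequences as quoted in the text.
BAR_HELICES = {
    "AspA": "DDKLKALIKAS",
    "SspB": "KKVQDLLKK",
    "SpaP": "QEIRDVLSK",
}

POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def net_formal_charge(sequence: str) -> int:
    """Count of K/R residues minus count of D/E residues."""
    positive = sum(1 for residue in sequence if residue in POSITIVE)
    negative = sum(1 for residue in sequence if residue in NEGATIVE)
    return positive - negative


if __name__ == "__main__":
    for protein, helix in BAR_HELICES.items():
        print(f"{protein}: {helix} net formal charge {net_formal_charge(helix):+d}")
```

By this count the SspB helix carries the most positive formal charge of the three, while the AspA helix begins with two acidic residues that have no counterpart in the SspB sequence.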

4. Conclusions 

In conclusion we have here presented the 1.8 Å crystal structure
of the C2 and C3 domains of the AgI/II type adhesin protein AspA
from S. pyogenes. This is the first part of the AspA protein to be
structurally characterized and this work, together with successful
structure determination of the other domains, especially of the
variable (V) domain, will give valuable insights into the molecular
mechanisms underlying adhesion and infection by S. pyogenes.